Local transition gradients determine the global attributes of protein energy landscapes 

The dynamical characterization of proteins is crucial to understand protein function. From a
microscopic point of view, protein dynamics is governed by the local atomic interactions that, in 
turn, trigger the functional conformational changes. Unfortunately, the relationship between local 
atomic fluctuations and global protein rearrangements is still elusive. Here, atomistic molecular dy- 
namics simulations in conjunction with complex network analysis show that fast peptide relaxations 
effectively build the backbone of the global free-energy landscape, providing a connection between 
local and global atomic rearrangements. A minimum-spanning-tree representation, built on the basis
of transition-gradient networks, results in a high-resolution mapping of the system dynamics and
thermodynamics without requiring any a priori knowledge of the relevant degrees of freedom. These 
results suggest the presence of a local mechanism for the high communication efficiency generally 
observed in complex systems. 

In complex systems the behavior of the whole is hardly
predictable from the fundamental laws of interactions of
the single components [1]. Indeed, it is hard to reveal
the intimate connection between the local properties of 
a complex system and its global behavior. This problem 
has been formalized in the context of complex networks 
[2] as a question of "navigability", i.e., the mechanism for
the efficient flow of information when the single network 
nodes do not have a global view of the overall topology. 
As a matter of fact, many collective dynamical processes 
are driven by the presence of (usually hidden) local gra- 
dients [3]. 

The study of protein dynamics involves a similar problem:
the coupling between the fast atomic fluctuations,
which are local, and the slow conformational changes, 
which are global [4]. These dynamical aspects have been
recently recognized to be crucial for protein function, 
playing an important role in signalling, allosteric
pathways and enzymatic reactions [SI IH]. Molecular dynamics
(MD) simulations are playing an increasing role in
complementing the experimental results [THl], which supply
useful, but limited, information on this question. The
recent combination of computational and experimental 
studies of a protein enzyme has pointed out that the fast 
atomic fluctuations are partly correlated to the
displacements occurring in the catalytic reaction lilO,;. As yet, the
coupling between the fast nanosecond timescales and the
functionally relevant transitions occurring in the
microsecond to millisecond range remains largely obscure.

Most descriptions of the free-energy surface governing 
protein dynamics have been rather qualitative because 
of the lack of proper order parameters and the intrinsic 
multidimensionality of the problem [11, 12]. These
limitations have triggered the development of a completely
new arsenal of tools inspired by network theory [13]. The
essential idea is to map the protein trajectory, obtained 
by computer simulations or experiments, onto a
conformation space network (CSN), whose nodes represent the
different microstates and whose links correspond to
direct transitions between them [13-15]. This approach has
been successfully applied to the study of peptide folding 
and structural transitions [14-19], as well as to interpret
electron transfer experiments [20] and time-resolved IR
measurements [21, 22].

In this letter, the relation between the local
properties of the free-energy landscape and its global
architecture is investigated by MD simulations of a four-residue
peptide, (GlySer)2, and complex network analysis. In
particular, when the CSN of a fully atomistic peptide is
reduced to the subgraph containing only one link per
microstate, pointing towards the most probable transition
(i.e. following the transition gradient), the energy
valleys and subvalleys, together with their equilibrium
populations and the hierarchy of transitions between
them, are naturally extracted. Hence, the fast local
motions build up the backbone of the global communication.
The observed coupling between local and global dynam- 
ical properties is expected to occur in a large class of 
complex systems. 

GlySer peptides, which are known to show little secondary
structure, have long been used as flexible linkers in
studies of polypeptide dynamics [321121]. MD simulations
using the Langevin algorithm [25] and the implicit
solvation model FACTS [26, 27] have been performed. A
trajectory of 280 ns at 340 K was obtained and snapshots
were saved every 140 steps for a total of 10^
conformations. During the simulation the peptide visits several
different conformations characterized by an end-to-end
distance between 3.1 and 12.7 Å (Fig. 1), indicating large
structural fluctuations.

The peptide microstates are defined as the inherent
structures (IS), i.e. the potential energy minima, of the
system [5H1. They are calculated by minimizing all the
10^ snapshots along the trajectory, resulting in 3044
different IS. The IS are a natural, physically meaningful
partition into microstates [30-32]. The conformation

FIG. 1. Timeseries window of the end-to-end distance of the
(GlySer)2 peptide. The peptide is extremely flexible,
alternating between compact states (dee ~ 3.4 Å) and extended
conformations (dee ~ 10.8 Å).

space network (CSN) is built on top of this microstate 
definition: the nodes and the links are the microstates 
(i.e. the IS) and their direct transitions observed during 
the MD trajectory, respectively. The obtained network 
is weighted and it is equivalent to a classical transition 
matrix when the columns of the adjacency matrix (i.e. 
the network links) are appropriately normalized to one. 
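
The construction described above can be illustrated with a minimal Python sketch. It assumes that the trajectory has already been reduced to a sequence of microstate labels (one IS label per saved snapshot). The function names build_csn and transition_probabilities, as well as the toy trajectory, are hypothetical and are not taken from the simulation. The outgoing weights of each node are stored together, which plays the role of one column of the adjacency matrix in the convention used in the text.

```python
from collections import Counter, defaultdict


def build_csn(labels):
    """Count the direct transitions src -> dst between consecutive snapshots."""
    counts = defaultdict(Counter)
    for src, dst in zip(labels, labels[1:]):
        counts[src][dst] += 1
    return counts


def transition_probabilities(counts):
    """Normalize the outgoing weights of every node so that they sum to one."""
    probs = {}
    for src, row in counts.items():
        total = sum(row.values())
        probs[src] = {dst: n / total for dst, n in row.items()}
    return probs


# Hypothetical toy trajectory of microstate labels (not data from the paper).
toy_labels = ["a", "a", "b", "a", "c", "c", "a", "b", "b", "a"]
toy_counts = build_csn(toy_labels)
toy_probs = transition_probabilities(toy_counts)
```

Storing the network as a dictionary of counters keeps only the observed links, which is convenient when, as here, most pairs of microstates are never directly connected.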

The relation between fast local modes and global chain
rearrangements is investigated by constructing, from the
full CSN, a new network with a reduced number of links.
For each network node, the transition with the highest
probability (excluding self-transitions) is kept, and all
others are deleted. These transitions define a gradient
in the network dynamics and result in a partitioning of
the network into several disconnected minimum spanning
trees [33], called here gradient-clusters for convenience.
(Gradient networks were originally introduced
by Toroczkai and Bassler to study jamming [3], though
in their case the gradient is defined on the nodes as a
quenched scalar field.) Following the pathway defined by
the most probable transitions leads to microstates lower
and lower in free energy, resulting in a kind of "steepest
descent pathway" on the free-energy surface. High-energy
microstates, in the neighborhood of a free-energy barrier,
connect either to one valley or to another. As a matter
of fact, the network characterizing the free-energy surface
is split into a set of disconnected minimum-spanning
trees representing the local attractors of the system
dynamics.
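
The link-reduction step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the network is given as a dictionary of transition counts, ties between equally frequent transitions are broken by label order (an arbitrary choice not discussed in the text), and the gradient-clusters are taken to be the connected components of the reduced network with link direction ignored. The names gradient_links and gradient_clusters and the toy counts are hypothetical.

```python
def gradient_links(counts):
    """Keep, for each node, only its most frequent transition to another node."""
    links = {}
    for src, row in counts.items():
        others = {dst: n for dst, n in row.items() if dst != src}
        if others:
            # Ties are broken by label order (arbitrary, for determinism only).
            links[src] = max(sorted(others), key=others.get)
    return links


def gradient_clusters(links, nodes):
    """Connected components of the reduced network, found with union-find."""
    parent = {node: node for node in nodes}
    for src, dst in links.items():
        parent.setdefault(src, src)
        parent.setdefault(dst, dst)

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for src, dst in links.items():
        parent[find(src)] = find(dst)
    clusters = {}
    for node in list(parent):
        clusters.setdefault(find(node), set()).add(node)
    return sorted(clusters.values(), key=sorted)


# Hypothetical toy network: counts[src][dst] = number of observed transitions.
toy_counts = {
    "a": {"a": 5, "b": 3, "c": 1},
    "b": {"a": 4, "b": 2},
    "c": {"d": 3, "a": 1, "c": 6},
    "d": {"c": 2, "d": 4},
}
toy_links = gradient_links(toy_counts)
toy_clusters = gradient_clusters(toy_links, toy_counts)
```

In this toy case the weak link from c to a is discarded, so the network splits into the two clusters {a, b} and {c, d} although the full network is connected.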

To better understand the nature of the gradient- 
clusters (161 in total), a cut-based free-energy profile 
(CFEP) [31] is calculated on the CSN and compared to 
the output of the gradient-partition. The CFEP is based 
on a network flux analysis following the idea that the 
network regions of minimum flow correspond to transition
states [3iH35]. In Fig. 2 the calculated CFEP is

FIG. 2. (Color online) Cut-based free-energy profile for the
(GlySer)2 peptide. Microstates assigned by the
gradient-approach to the four most populated valleys are shown on
the profile in different colors. The CFEP represents the
free-energy surface projected on the cumulative partition function
reaction coordinate Z [M], relative to a given reference
microstate (in this case the most populated one). In the inset, a
profile is calculated on a subgraph made of the microstates
belonging to V4 (triangles) and a smaller gradient-cluster
(circles).

shown. The profile reveals that the peptide free-energy
landscape is rugged. Remarkably, each of the obtained
gradient-clusters represents either a valley or a subvalley of the
free-energy landscape (Fig. 2). In Table I the population
of the four most relevant free-energy basins detected by
the gradient-approach is compared against the
populations calculated by the minimum-cut method [16], which
is one of the most accurate approaches for this type of
calculation [H]. The results indicate that the populations of
the gradient-clusters are accurate, effectively reproducing
the correct thermodynamics of the system. This is
particularly relevant since the gradient-approach does not use
any global property of the system, either in terms of
barrier heights or of microstate energies. On the other hand,
the CFEP and the minimum-cut method do perform a
global analysis of the network. These observations
suggest an interesting application of the proposed approach:
automatically detecting the metastable states
sampled by multiple short MD runs, with the aim of
building simplified Markov models [37].
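
As an illustration of how little information the population estimate requires, the following Python sketch sums the occurrences of the microstates assigned to each gradient-cluster. The names cluster_populations and relative_free_energies and the toy data are hypothetical. The conversion to a free energy in units of kBT uses the standard Boltzmann relation, -ln(p/p_max), which is an assumption added here and not a formula given in the text.

```python
import math
from collections import Counter


def cluster_populations(labels, clusters):
    """Fraction of snapshots whose microstate belongs to each cluster."""
    occurrences = Counter(labels)
    total = len(labels)
    return [sum(occurrences[s] for s in cluster) / total for cluster in clusters]


def relative_free_energies(populations):
    """Free energy of each cluster relative to the most populated one, in kBT.

    Uses the Boltzmann relation -ln(p / p_max); this is an assumption of the
    sketch, not a step described in the text.
    """
    p_max = max(populations)
    return [-math.log(p / p_max) for p in populations]


# Hypothetical toy data: ten snapshots and two clusters (not from the paper).
toy_labels = ["a"] * 5 + ["b"] * 3 + ["c"] * 2
toy_clusters = [{"a", "b"}, {"c"}]
toy_pops = cluster_populations(toy_labels, toy_clusters)
toy_free = relative_free_energies(toy_pops)
```

Only the number of visits to each microstate enters the estimate; no barrier height or microstate energy is needed.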

The gradient-approach can be applied in an iterative,
hierarchical fashion, when the gradient-clusters are
themselves considered as nodes of a higher-level CSN. In
this case, the nodes and the links of the network are the
gradient-clusters and the connections between them,
respectively. The iteration can be applied recursively until
all the microstates are merged into one cluster, which
is represented as a minimum-spanning-forest. The tree
structure obtained for the (GlySer)2 peptide is shown in
Fig. 3. Link widths represent the iteration step at which the
edge was introduced. For example, the V1 and V2 valleys

Valley   Gradient population   Mincut population   <dee> [Å]
V1       0.429                 0.428                4.167
V2       0.120                 0.120                4.253
V3       0.068                 0.064               10.960
V4 (a)   0.037                 0.051                4.160

(a) To take entropic effects into account, gradient-clusters
separated by free-energy barriers smaller than kBT/10 have
been merged.

TABLE I. Comparison of the populations of the valleys
found, calculated by the gradient-approach or by the
minimum-cut method [16]. Populations are defined as the sum of the
number of occurrences, calculated along the trajectory, of the
microstates assigned to a given valley. In the last column the
average value of the end-to-end distance dee inside a valley is
given.



are merged at the first iteration, indicating that they
interconvert rapidly. On the other hand, V4 is merged with V1
at the fourth iteration, indicating that this is the slowest
transition to V1. The gradient-minimum-spanning-forest
represents, in an intrinsically multidimensional fashion,
the valleys of the free-energy landscape and their
dynamical organization in fast and slow relaxations. The
iteration step at which two gradient-clusters are merged
together is a kinetic measure of the dynamical distance
between the two valleys.
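
The iterative procedure can be sketched as follows. The text does not specify how the link weights between gradient-clusters are defined at the higher levels, so this sketch assumes that they are the summed transition counts between the member microstates. All function names and the toy counts are hypothetical, and ties are again broken by label order.

```python
from collections import Counter, defaultdict


def gradient_partition(counts):
    """Map each node to a cluster id, following one best link per node."""
    parent = {node: node for node in counts}
    for row in counts.values():
        for dst in row:
            parent.setdefault(dst, dst)

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for src, row in counts.items():
        others = {dst: n for dst, n in row.items() if dst != src}
        if others:
            best = max(sorted(others), key=others.get)
            parent[find(src)] = find(best)
    return {node: find(node) for node in parent}


def coarse_grain(counts, assignment):
    """Sum the counts between clusters (assumed definition of the new links)."""
    coarse = defaultdict(Counter)
    for src, row in counts.items():
        out = coarse[assignment[src]]
        for dst, n in row.items():
            out[assignment[dst]] += n
    return coarse


def merge_hierarchy(counts):
    """Repeat the gradient partition; return the node -> cluster map per level."""
    levels, membership, current = [], None, counts
    while True:
        assignment = gradient_partition(current)
        n_clusters = len(set(assignment.values()))
        if n_clusters == len(assignment):
            break  # nothing was merged at this level, so stop
        if membership is None:
            membership = dict(assignment)
        else:
            membership = {node: assignment[c] for node, c in membership.items()}
        levels.append(membership)
        if n_clusters == 1:
            break
        current = coarse_grain(current, assignment)
    return levels


def merge_step(levels, a, b):
    """First iteration (counted from 1) at which a and b share a cluster."""
    for step, membership in enumerate(levels, start=1):
        if membership[a] == membership[b]:
            return step
    return None


# Hypothetical toy network: counts[src][dst] = number of observed transitions.
toy_counts = {
    "a": {"a": 5, "b": 3, "c": 1},
    "b": {"a": 4, "b": 2},
    "c": {"d": 3, "a": 1, "c": 6},
    "d": {"c": 2, "d": 4},
}
toy_levels = merge_hierarchy(toy_counts)
```

In the toy network, a and b share a cluster after the first iteration, whereas a and d are joined only at the second one, so merge_step returns a larger value for the pair separated by the slower transition.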

Concluding, we found that, for an all-atom peptide MD
simulation, the fast local relaxations produce a partition
of the conformational space into disconnected
minimum-spanning-trees. This partition reproduces the
organization of the free-energy landscape into valleys and
subvalleys with the correct populations. The iterative
connection of those valleys into a minimum-spanning-forest
recovers the global backbone architecture of the
dynamics occurring on the free-energy landscape. A similar
coupling between local and global dynamics is expected to
take place in other complex systems. These results are
relevant to investigating the still unclear relationship
between network structure and dynamics in transport
processes ranging from metabolic pathways to air travel
and the internet.

FIG. 3. [...] representation consistently represents the fast relaxations be[...]
[...]crowding only nodes with populations larger than 10^[...] are shown).